Estimation in a fluctuating medium and power-law distributions 
We show how recent results by Bening and Korolev in the context of estimation, when linked with a classical result of Fisher concerning the negative binomial distribution, can be used to explain the ubiquity of power-law probability distributions. Beck, Cohen and others have provided plausible mechanisms explaining how power-law probability distributions naturally emerge in scenarios characterized by either finite dimension or fluctuation effects. This paper contributes further to that idea. As an application, a new multivariate version of the central limit theorem is obtained that provides a convenient alternative to the one recently presented in [S. Umarov, C. Tsallis, S. Steinberg, cond-mat/0603593].



I. ESTIMATION IN A FLUCTUATING CONTEXT 

Beck, Cohen, and others have provided strong indications concerning the way in which power-law probability distributions (PDs) naturally emerge in scenarios characterized by either finite dimension or fluctuation effects [1, 2, 3, 4, ...]. In a parallel vein, we offer here a purely statistical argument to the same effect, one that exhibits a simple mechanism leading to the appearance of these PDs. This mechanism is based on a more general result from Estimation Theory.

In the conventional estimation scenario, a series of independent random data $\{X_i\}$ is observed; their common distribution is supposed to belong to a parameterized set of distributions $\{P_\theta;\ \theta \in \Theta\}$. A statistic $T_n(X_1, \ldots, X_n)$ is defined as a measurable function of the observed data. This statistic is called asymptotically normal if it satisfies the following property: there exist functions $\delta(\theta)$ and $t(\theta)$ such that the distribution

\[ P_\theta\{\delta(\theta)\sqrt{n}\,(T_n - t(\theta)) < x\} \]

converges weakly to the normal distribution as $n \to +\infty$. Asymptotically normal statistics are ubiquitous in the real world, the sample mean and maximum likelihood estimators being notable examples.

Before proceeding we recall that

• the Gamma distribution function with scale parameter $\alpha$ and shape parameter $\lambda$ is defined as

\[ G_{\alpha,\lambda}(x) = \begin{cases} \ldots, & x \ge 0, \\ 0, & \text{else}; \end{cases} \tag{1} \]

• the d-variate t-distribution, or Student's t-distribution, with $\gamma$ degrees of freedom $F_\gamma(x_1, \ldots, x_d)$ is a probability distribution that arises in the problem of estimating the mean of a normally distributed population when the sample size is small. It reads (x and y below are d-dimensional vectors and the superscript t denotes transposition)

$F_\gamma(x_1, \ldots$
Assume now, and this is our critical point here, that a random number of data is available to build the statistic $T$. This is in fact often the case in real physical experiments such as multiparticle detection [10, 11, 12] or photon statistics [13, 14, 15, 16]. One assumes that there exists a family of integer-valued random variables $\{N_n\}$ that are independent of the observed data $\{X_i\}$. We say that $N_n \to \infty$ in probability as $n \to \infty$ if, $\forall K > 0$,

\[ \lim_{n \to \infty} \Pr\{N_n > K\} = 1. \tag{3} \]



In such a situation, a notable result of Bening and Korolev becomes applicable [17, Theorem 2.1]; we give here an immediate multivariate version of this result, stated in the following theorem.

Theorem 1. Let $\gamma > 0$ be arbitrary and let $\{d_n\}_{n \ge 1}$ be some infinitely increasing sequence of positive numbers. Suppose that $N_n \to \infty$ in probability as $n \to \infty$ with respect to any probability from a family $\{P_\theta;\ \theta \in \Theta\}$. Let the statistic $T_n \in \mathbb{R}^d$ be asymptotically normal. In order to have, for any $\theta \in \Theta$,

\[ P_\theta\{\delta(\theta)\sqrt{d_n}\,(T_{N_n} - t(\theta)) < x\} \Rightarrow F_\gamma(x), \quad n \to \infty, \tag{4} \]

where $F_\gamma(x)$ is the multivariate Student t-distribution function with $\gamma$ degrees of freedom, it is necessary and sufficient that for any $\theta \in \Theta$,

\[ P_\theta\{N_n < d_n x\} \Rightarrow G_{\gamma/2,\gamma/2}(x), \quad n \to \infty. \tag{5} \]

The proof is immediate from the univariate version in [17, Theorem 2.1] and relies on the stochastic representation of a multivariate t-distributed random vector $X$ with $\gamma$ degrees of freedom as the Gaussian mixture

\[ X = \frac{G}{\sqrt{a}}, \tag{6} \]

where $G$ is a multivariate Gaussian vector and $a$ is a scalar random variable independent of $G$ following a Gamma distribution with shape parameter $\gamma/2$.
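
The mixture representation (6) can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that the Gamma variable $a$ has shape $\gamma/2$ and rate $\gamma/2$ (so that its mean is 1), which is the usual normalization for a Student t law; the function name `sample_student_mixture` and the chosen values of $\gamma$, $d$ and the sample size are illustrative.

```python
import numpy as np
from scipy import stats


def sample_student_mixture(gamma_dof, d, size, rng):
    """Draw `size` samples of X = G / sqrt(a) in dimension d.

    G is a standard d-variate Gaussian vector; a is an independent Gamma
    variable with shape gamma_dof/2 and rate gamma_dof/2 (assumed here).
    """
    g = rng.standard_normal((size, d))
    # NumPy parametrizes the Gamma law by shape and scale = 1/rate.
    a = rng.gamma(shape=gamma_dof / 2.0, scale=2.0 / gamma_dof, size=size)
    return g / np.sqrt(a)[:, None]


rng = np.random.default_rng(0)
x = sample_student_mixture(gamma_dof=5.0, d=2, size=200_000, rng=rng)

# Each coordinate should follow a univariate Student t law with 5 degrees
# of freedom; a small Kolmogorov-Smirnov distance supports this.
ks = stats.kstest(x[:, 0], stats.t(df=5.0).cdf)
print("KS distance to Student t (5 dof):", ks.statistic)
```

A small Kolmogorov-Smirnov distance for each coordinate is consistent with the representation; it is a sanity check, not a proof.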

As an example of a family of random sample sizes $\{N_n\}$ that satisfies condition (5), Bening and Korolev provide the negative binomial (or Pascal) distribution: given a positive real $r$ and a probability $0 < p < 1$, this distribution is

\[ \Pr\{N = k\} = \binom{k+r-1}{k}\, p^r (1-p)^k, \quad k = 0, 1, \ldots \tag{7} \]



Note that this distribution was independently characterized in [12] as the particle multiplicity distribution that ensures that the energies associated with these particles follow a q-exponential distribution. To interpret $N$, recall the Bernoulli process, one of the simplest yet most important random processes in probability. Essentially, it is the mathematical expression of coin tossing, but because of its wide applicability it is usually stated in terms of a sequence of generic trials that satisfy the following assumptions: (i) each trial has two possible outcomes, generically called success and failure; (ii) the trials are independent (the outcome of one trial has no influence on the outcome of another); (iii) on each trial, the probability of success is $p$ and the probability of failure is $1 - p$. If in Eq. (7) above we assume that $r \in \mathbb{N}$, then $N$ can indeed be interpreted, in a series of Bernoulli trials, as the number of failures observed before the $r$-th success. If $p$ is chosen according to

\[ p = \frac{1}{n} \tag{8} \]



then convergence as in (5) holds with $d_n = rn$ and $\gamma = 2r$. The connection with power-law probability distributions now becomes immediate. We only have to remember [20, 21, 22, 23] that the q-Gaussian distributions, which are q-exponentials $e_q$ of a quadratic form in $x$, with

\[ e_q(x) = [1 + (1-q)x]^{1/(1-q)}; \quad x \in \mathbb{R};\ q \in \mathbb{R}, \tag{9} \]



are power-law distributions that maximize Tsallis entropy under the covariance constraint $E\,xx^t = \langle xx^t \rangle = K$ and that, for

\[ 1 < q < \frac{d+4}{d+2}, \tag{10} \]

they coincide with d-variate Student t-distributions with $\gamma$ degrees of freedom, provided that

\[ \gamma = \frac{2 - d(q-1)}{q-1}. \tag{11} \]
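
Relation (11) and its inverse are easy to tabulate. The helper names below are hypothetical, and the sketch reads (11) as: number of degrees of freedom $= (2 - d(q-1))/(q-1)$, which inverts to $q = 1 + 2/(\text{dof} + d)$. The range check implements (10).

```python
def dof_from_q(q, d):
    """Degrees of freedom of the d-variate Student t law matching a given q, as in (11)."""
    upper = (d + 4.0) / (d + 2.0)
    if not (1.0 < q < upper):
        # Range (10): outside it the correspondence used here does not apply.
        raise ValueError(f"q must lie in (1, {upper}) for dimension d={d}")
    return (2.0 - d * (q - 1.0)) / (q - 1.0)


def q_from_dof(dof, d):
    """Inverse of (11): q = 1 + 2 / (dof + d)."""
    return 1.0 + 2.0 / (dof + d)


# Example: in dimension d = 1, q = 1.5 corresponds to 3 degrees of freedom.
example_dof = dof_from_q(1.5, d=1)
example_q = q_from_dof(example_dof, d=1)
print(example_dof, example_q)
```

Note that the upper bound in (10) corresponds to more than 2 degrees of freedom, that is, to a finite covariance matrix.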



One then concludes that an asymptotically normal statistic becomes an asymptotically q-Gaussian statistic if it is built upon data whose number fluctuates according to a negative binomial distribution, as in (7).
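
The key ingredient, convergence of the rescaled sample size, can be illustrated by simulation. This is a rough sketch under stated assumptions: the limit in condition (5) is taken to be the Gamma law with shape $r$ and rate $r$ (that is, $\gamma = 2r$), the values of `r`, `n` and the number of draws are arbitrary, and NumPy's `negative_binomial` is used because it counts failures before the $r$-th success, which matches the convention of (7).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
r, n = 2.5, 10_000          # illustrative values; r need not be an integer
p = 1.0 / n                 # choice (8)

# Negative binomial sample sizes, rescaled by d_n = r * n.
N = rng.negative_binomial(r, p, size=100_000)
u = N / (r * n)

# Compare with a Gamma law of shape r and rate r (SciPy uses scale = 1/rate).
ks = stats.kstest(u, stats.gamma(a=r, scale=1.0 / r).cdf)
print("mean of N/(rn):", u.mean())
print("KS distance to Gamma(shape=r, rate=r):", ks.statistic)
```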



II. THE NEGATIVE BINOMIAL DISTRIBUTION 

The shape of the negative binomial distribution and its dependence on the parameters $p$ and $r$ are illustrated by Figures 1 and 2 below. In Figure 1 the parameter $r$ is fixed to $r = 5$ while the parameter $p$ takes values in the set $\{1/10, 1/15, 1/20, 1/25\}$. In Figure 2 the parameter $p$ is fixed to $p = 1/2$, while the parameter $r$ takes values in the set $\{5, 10, 15, 20, 25\}$. The appearance of the negative binomial in this context can be justified by reference to the following idea of Fisher [25]: let us consider a discrete random variable $P$ that follows a Poisson distribution with parameter $\lambda$:

\[ \Pr\{P = k\} = \frac{\exp(-\lambda)\,\lambda^k}{k!}. \]



Now let us assume that the parameter $\lambda$ is itself a random variable; we quote Fisher verbatim [25]:

"Since $\lambda$ is necessary positive, the simplest frequency distribution which allows some variation of $\lambda$ is the Eulerian distribution, familiar as that of $\chi^2$, in which the frequency element is

\[ df = \frac{1}{(r-1)!}\, p^{-r} \lambda^{r-1} e^{-\lambda/p}\, d\lambda. \]

For $\chi^2$ the parameter $r$ is always the half of a positive integer; in general it may be any number exceeding zero".

We thus deduce that a negative binomial distribution is nothing but a Poisson distribution whose random parameter $\lambda$ itself follows a Gamma distribution with shape parameter $r$ (that is, up to scaling, a $\chi^2$ distribution with $2r$ degrees of freedom). Note that this property is cited in both [12] and [17], but without mentioning Ref. [25].
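
Fisher's observation can be verified by simulation. The sketch below is illustrative only: it assumes $\lambda$ is Gamma-distributed with shape `r` and scale `scale` (both values chosen arbitrarily), in which case the mixed Poisson variable should follow the negative binomial law (7) with success probability $p = 1/(1 + \text{scale})$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
r, scale = 3.0, 4.0                     # illustrative Gamma shape and scale

# Step 1: draw a random Poisson parameter for every observation.
lam = rng.gamma(shape=r, scale=scale, size=500_000)
# Step 2: draw Poisson counts with those random parameters.
k = rng.poisson(lam)

# Negative binomial with the same r and p = 1 / (1 + scale).
p = 1.0 / (1.0 + scale)
support = np.arange(60)
empirical = np.bincount(k, minlength=60)[:60] / k.size
exact = stats.nbinom(r, p).pmf(support)

max_gap = np.abs(empirical - exact).max()
print("largest pmf discrepancy on 0..59:", max_gap)
```

The largest discrepancy between the empirical frequencies and the negative binomial probabilities should be of the order of the Monte Carlo error.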

Our central point here is the following: in a great number of experiments one observes Poisson-distributed events with a device that necessarily exhibits, like all instruments, positive fluctuations originating in a number of independent sources [18]. This is a fact of life that justifies the negative binomial distribution of the observed data and, a posteriori (asymptotically), the q-Gaussian distribution of any statistic associated with these data.



III. APPLICATION TO RANDOM SUMMATION 

As a rather important application of the above results, we make explicit the following consequence, which provides an alternative approach to the intriguing problem of the existence of a central limit theorem for power-law distributions [24]. A classical statistic is the sample mean estimator, defined as

\[ T_n = \frac{1}{n} \sum_{i=1}^{n} X_i, \]

where $X_i \in \mathbb{R}^d$, $1 \le i \le n$, are independent random vectors with finite covariance matrix. It is well known that this statistic is asymptotically normal. Applying now the preceding results, we deduce that if the number of data used in this statistic is itself a random variable $N_n$ following a negative binomial distribution with parameters $r$ and $p = 1/n$ (condition (8)), then the resulting sample mean statistic

\[ T = \frac{1}{N_n} \sum_{i=1}^{N_n} X_i \tag{12} \]

is asymptotically (as $n \to +\infty$) q-Gaussian with $\gamma = 2r$ degrees of freedom.
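
A univariate simulation gives a feel for this random-sum central limit theorem. It is only a sketch under explicit assumptions: the data are taken to be i.i.d. exponential with mean 1 and variance 1 (so the sum of $N$ of them can be drawn directly as a Gamma variable with shape $N$), the normalization $\sqrt{rn}\,(T - 1)$ follows Theorem 1 with $d_n = rn$, the limit is taken to be the Student t law with $2r$ degrees of freedom, and sample sizes equal to zero, which are extremely rare here, are replaced by 1.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
r, n, reps = 2.0, 2_000, 100_000        # illustrative values

# Random sample sizes: negative binomial with parameters r and p = 1/n.
N = np.maximum(rng.negative_binomial(r, 1.0 / n, size=reps), 1)

# The sum of N i.i.d. Exp(1) variables is Gamma(shape=N, scale=1),
# which avoids generating every individual observation.
sums = rng.gamma(shape=N)
T = sums / N                             # sample mean built on N data

# Normalize with d_n = r * n; Exp(1) data have mean 1 and variance 1.
z = np.sqrt(r * n) * (T - 1.0)

ks_t = stats.kstest(z, stats.t(df=2.0 * r).cdf).statistic
ks_norm = stats.kstest(z, stats.norm.cdf).statistic
print("KS distance to Student t (2r dof):", ks_t)
print("KS distance to standard normal:   ", ks_norm)
```

In such a run the normalized statistic is expected to sit closer to the Student t law than to the Gaussian, which is the heavy-tail effect of the fluctuating sample size.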

This result constitutes a multivariate form of the central limit theorem and represents an alternative to the recent and interesting result of Ref. [24], where a certain type of dependence between the data, called q-independence, is shown to ensure the asymptotic q-"Gaussianity" of the sample mean statistic. We underline that in the result presented here no special type of dependence is required, since its conditions of application coincide with the conditions required in the usual central limit theorem.




IV. CONCLUSIONS 



We have shown, by applying results of Bening and Korolev [17], that q-Gaussian distributions necessarily emerge in the very general context of estimation theory.

More specifically, if an asymptotically normal statistic is used with a random number of data that follows a negative binomial distribution with parameters $p = 1/n$ and $r$, then the resulting statistic is in fact asymptotically q-Gaussian-distributed, with the number of degrees of freedom given by (11), the parameter $q$ belonging to the range of values (10). With reference to this q-range, precise indications as to which is the "correct" value in a given scenario are still the subject of intense debate [21, 23]. Our present results may also be construed as a significant contribution to that debate.

FIG. 1: the negative binomial distribution for r = 5 and p = 1/10 (back), 1/15, 1/20 and 1/25 (front)

FIG. 2: the negative binomial distribution for p = 1/2 and r = 5, 10, 15, 20 and 25